NSF PAR Search | NSF Public Access Repository

Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis

https://doi.org/10.1609/aaai.v39i17.34033

Li, Weikai; Wang, Ding; Ding, Zijian; Sohrabizadeh, Atefeh; Qin, Zongyue; Cong, Jason; Sun, Yizhou (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

High-level synthesis (HLS) is a widely used tool for designing Field Programmable Gate Arrays (FPGAs). HLS enables FPGA design with software programming languages by compiling the source code into an FPGA circuit. The source code includes a program (called a "kernel") and several pragmas that instruct hardware synthesis, such as parallelization and pipelining. While it is relatively easy for software developers to design the program, designing the pragmas relies heavily on hardware knowledge, posing a big challenge for software developers. Recently, different machine learning algorithms, such as graph neural networks (GNNs), have been proposed to automate the pragma design via performance prediction. However, when the trained model is applied to new kernels, the significant domain shift often leads to unsatisfactory performance. We propose a more domain-generalizable model structure: a two-level hierarchical Mixture of Experts (MoE) that can be flexibly adapted to any GNN model. Different expert networks can learn to deal with different regions in the representation space, and they can utilize similar patterns between the old kernels and new kernels. In the low-level MoE, we apply MoE on three natural granularities of a program: node, basic block, and graph. The high-level MoE learns to aggregate the three granularities for the final decision. To stably train the hierarchical MoE, we further propose a two-stage training method. Extensive experiments verify the effectiveness of the hierarchical MoE.
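The two-level structure described in this abstract can be pictured with a small sketch. It is not the authors' implementation: it assumes one already-pooled feature vector per granularity, linear gates, and scalar-valued experts, and the names `moe`, `hierarchical_moe`, `low_level`, and `high_gate_weights` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Vector], float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def moe(x: Vector, gate_weights: List[Vector], experts: List[Expert]) -> float:
    """Soft mixture: a linear gate scores each expert, outputs are gate-weighted."""
    gates = softmax([dot(w, x) for w in gate_weights])
    return sum(g * expert(x) for g, expert in zip(gates, experts))


def hierarchical_moe(
    features: Dict[str, Vector],
    low_level: Dict[str, Tuple[List[Vector], List[Expert]]],
    high_gate_weights: List[Vector],
) -> float:
    """Low level: one MoE per granularity. High level: a gate over the three results."""
    names = ["node", "block", "graph"]
    # Assumption: each granularity is summarized by a single pooled feature vector.
    low_outputs = [moe(features[n], *low_level[n]) for n in names]
    # Assumption: the high-level gate is a linear function of the low-level outputs.
    gates = softmax([dot(w, low_outputs) for w in high_gate_weights])
    return sum(g * out for g, out in zip(gates, low_outputs))
```

With all gate weights set to zero, every gate is uniform, so the result is the plain average of the expert outputs; trained gates would shift that weighting per input.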
Full Text Available
On the Prediction of Tremor Dynamics Motion Using Neural Network

https://doi.org/10.1115/DETC2024-143313

Ding, Zijian; Barry, Oumar (August 2024, American Society of Mechanical Engineers)

Pathological tremors significantly affect the quality of life for patients worldwide. Rehabilitation exoskeletons serve as one of the solutions to alleviate these pathological tremors, and voluntary motion prediction-based motion planning has been employed to enhance the performance of these devices. This paper presents a method for predicting future voluntary movement in tremor-alleviating rehabilitation exoskeletons that use voluntary motion prediction-based motion planning. In this study, a neural network based on a Convolutional Neural Network and Transformer architecture works with EMG sensors to predict future voluntary movements. The results show that the approach performs well in predicting future voluntary movements, but it still cannot filter out the tremors completely. In summary, we provide a concept for predicting future voluntary movement, which has the potential to improve the effectiveness of rehabilitation exoskeletons in tremor alleviation.
Full Text Available
Iceberg: Enhancing HLS Modeling with Synthetic Data

https://doi.org/10.1109/ICLAD65226.2025.00032

Ding, Zijian; Nguyen, Tung; Li, Weikai; Grover, Aditya; Sun, Yizhou; Cong, Jason (June 2025, IEEE)

Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models through pretraining on synthetic data and introduce Iceberg, a synthetic data augmentation approach that expands both large language model (LLM)-generated programs and weak labels of unseen design configurations. Our weak label generation method is integrated with an in-context model architecture, enabling meta-learning from actual and proximate labels. Iceberg improves the geometric mean modeling accuracy by 86.4% when adapted to six real-world applications with few-shot examples and achieves a 2.47× and a 1.12× better offline DSE performance when adapting to two different test datasets. Our open-sourced code is here: https://github.com/UCLA-VAST/iceberg.
Full Text Available
Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis

https://doi.org/10.1145/3670474.3685952

Qin, Zongyue; Bai, Yunsheng; Sohrabizadeh, Atefeh; Ding, Zijian; Hu, Ziniu; Sun, Yizhou; Cong, Jason (September 2024, ACM)

In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level description written in C/C++ into a design with low-level hardware description languages that eventually synthesize DSAs on circuits. However, creating a high-quality HLS design still demands significant domain knowledge, particularly in microarchitecture decisions expressed as pragmas. Thus, it is desirable to automate such decisions with the help of machine learning for predicting the quality of HLS designs, requiring a deeper understanding of the program that consists of original code and pragmas. Naturally, these programs can be considered as sequence data. In addition, these programs can be compiled and converted into a control data flow graph (CDFG). But existing works either fail to leverage both modalities or combine the two in shallow or coarse ways. We propose ProgSG, a model that allows interaction between the source code sequence modality and the graph modality in a deep and fine-grained way. To alleviate the scarcity of labeled designs, a pre-training method is proposed based on a suite of compiler’s data flow analysis tasks. Experimental results show that ProgSG reduces the RMSE of design performance predictions by up to 22%, and identifies designs with an average of 1.10× and 1.26× (up to 8.17× and 13.31×) performance improvement in the design space exploration (DSE) task compared to HARP and AutoDSE, respectively.
Full Text Available
Learning to Compare Hardware Designs for High-Level Synthesis

https://doi.org/10.1145/3670474.3685940

Bai, Yunsheng; Sohrabizadeh, Atefeh; Ding, Zijian; Liang, Rongjian; Li, Weikai; Wang, Ding; Ren, Haoxing; Sun, Yizhou; Cong, Jason (September 2024, ACM)

High-level synthesis (HLS) is an automated design process that transforms high-level code into optimized hardware designs, enabling rapid development of efficient hardware accelerators for various applications such as image processing, machine learning, and signal processing. To achieve optimal performance, HLS tools rely on pragmas, which are directives inserted into the source code to guide the synthesis process, and these pragmas can have various settings and values that significantly impact the resulting hardware design. State-of-the-art ML-based HLS methods, such as HARP, first train a deep learning model, typically based on graph neural networks (GNNs) applied to graph-based representations of the source code and its pragmas. They then perform design space exploration (DSE) to explore the pragma design space, rank candidate designs using the trained model, and return the top designs as the final designs. However, traditional DSE methods face challenges due to the highly nonlinear relationship between pragma settings and performance metrics, along with complex interactions between pragmas that affect performance in non-obvious ways. To address these challenges, we propose compareXplore, a novel approach that learns to compare hardware designs for effective HLS optimization. compareXplore introduces a hybrid loss function that combines pairwise preference learning with pointwise performance prediction, enabling the model to capture both relative preferences and absolute performance values. Moreover, we introduce a novel Node Difference Attention module that focuses on the most informative differences between designs, enhancing the model’s ability to identify critical pragmas impacting performance. compareXplore adopts a two-stage DSE approach, where a pointwise prediction model is used for the initial design pruning, followed by a pairwise comparison stage for precise performance verification. Experimental results demonstrate that compareXplore achieves significant improvements in ranking metrics and generates high-quality HLS results for the selected designs, outperforming the existing state-of-the-art method.
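A hybrid loss of the kind this abstract describes might look like the sketch below. The abstract does not give the exact formulation, so everything here is an assumption: mean squared error stands in for the pointwise term, a logistic loss on predicted score differences stands in for the pairwise preference term, and the function name `hybrid_loss` and the mixing weight `alpha` are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence


def softplus(z: float) -> float:
    """Stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def hybrid_loss(pred: Sequence[float], target: Sequence[float], alpha: float = 0.5) -> float:
    """alpha * pairwise preference loss + (1 - alpha) * pointwise regression loss."""
    n = len(pred)
    # Pointwise term (assumed): mean squared error on absolute performance values.
    pointwise = sum((p - t) ** 2 for p, t in zip(pred, target)) / n

    # Pairwise term (assumed): logistic loss that penalizes mis-ordered pairs.
    pair_terms = []
    for i, j in combinations(range(n), 2):
        if target[i] == target[j]:
            continue  # ties carry no preference signal
        sign = 1.0 if target[i] > target[j] else -1.0
        margin = sign * (pred[i] - pred[j])  # positive when the order is correct
        pair_terms.append(softplus(-margin))
    pairwise = sum(pair_terms) / len(pair_terms) if pair_terms else 0.0

    return alpha * pairwise + (1.0 - alpha) * pointwise
```

Setting `alpha=0` reduces the sketch to pure regression, while larger values reward getting the relative order of designs right even when absolute predictions are off.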
Full Text Available
UniSparse: An Intermediate Language for General Sparse Format Customization

https://doi.org/10.1145/3649816

Liu, Jie; Zhao, Zhongyuan; Ding, Zijian; Brock, Benjamin; Rong, Hongbo; Zhang, Zhiru (April 2024, Proceedings of the ACM on Programming Languages)

The ongoing trend of hardware specialization has led to a growing use of custom data formats when processing sparse workloads, which are typically memory-bound. These formats facilitate optimized software/hardware implementations by utilizing sparsity pattern- or target-aware data structures and layouts to improve memory access latency and bandwidth utilization. However, existing sparse tensor programming models and compilers offer little or no support for productively customizing the sparse formats. Additionally, because these frameworks represent formats using a limited set of per-dimension attributes, they lack the flexibility to accommodate numerous new variations of custom sparse data structures and layouts. To overcome this deficiency, we propose UniSparse, an intermediate language that provides a unified abstraction for representing and customizing sparse formats. Unlike the existing attribute-based frameworks, UniSparse decouples the logical representation of the sparse tensor (i.e., the data structure) from its low-level memory layout, enabling the customization of both. As a result, a rich set of format customizations can be succinctly expressed in a small set of well-defined query, mutation, and layout primitives. We also develop a compiler leveraging the MLIR infrastructure, which supports adaptive customization of formats, and automatic code generation of format conversion and compute operations for heterogeneous architectures. We demonstrate the efficacy of our approach through experiments running commonly used sparse linear algebra operations with specialized formats on multiple hardware targets, including an Intel CPU, an NVIDIA GPU, an AMD Xilinx FPGA, and a simulated processing-in-memory (PIM) device.
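For readers unfamiliar with sparse format conversion, the following generic sketch converts a matrix from coordinate (COO) form to compressed sparse row (CSR) form. It is a textbook illustration only and does not use or imitate UniSparse's primitives or its MLIR-based compiler; the function name `coo_to_csr` is hypothetical.

```python
from typing import List, Sequence, Tuple


def coo_to_csr(
    n_rows: int,
    rows: Sequence[int],
    cols: Sequence[int],
    vals: Sequence[float],
) -> Tuple[List[int], List[int], List[float]]:
    """Convert COO triples (row, col, value) into CSR arrays."""
    # Sort entries by (row, col) so each row's entries are contiguous.
    order = sorted(range(len(vals)), key=lambda k: (rows[k], cols[k]))

    # row_ptr[r + 1] first counts the entries in row r, then becomes a prefix sum.
    row_ptr = [0] * (n_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for r in range(n_rows):
        row_ptr[r + 1] += row_ptr[r]

    col_idx = [cols[k] for k in order]
    data = [vals[k] for k in order]
    return row_ptr, col_idx, data
```

The logical content (which entries are nonzero) is identical before and after; only the memory layout changes, which is the distinction the abstract draws between a tensor's logical representation and its layout.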
Full Text Available
How Diverse Initial Samples Help and Hurt Bayesian Optimizers

https://doi.org/10.1115/1.4063006

Kamrah, Eesh; Ghoreishi, Seyede Fatemeh; Ding, Zijian “Jason”; Chan, Joel; Fuge, Mark (November 2023, Journal of Mechanical Design)

Design researchers have struggled to produce quantitative predictions for exactly why and when diversity might help or hinder design search efforts. This paper addresses that problem by studying one ubiquitously used search strategy, Bayesian optimization (BO), on a 2D test problem with modifiable convexity and difficulty. Specifically, we test how providing diverse versus non-diverse initial samples to BO affects its performance during search and introduce a fast ranked-determinantal point process method for computing diverse sets, which we need to detect sets of highly diverse or non-diverse initial samples. We initially found, to our surprise, that diversity did not appear to affect BO, neither helping nor hurting the optimizer’s convergence. However, follow-on experiments illuminated a key trade-off. Non-diverse initial samples hastened posterior convergence for the underlying model hyper-parameters, a model-building advantage. In contrast, diverse initial samples accelerated exploring the function itself, a space-exploration advantage. Both advantages help BO, but in different ways, and the initial sample diversity directly modulates how BO trades those advantages. Indeed, we show that fixing the BO hyper-parameters removes the model-building advantage, causing diverse initial samples to always outperform models trained with non-diverse samples. These findings shed light on why, at least for BO-type optimizers, the use of diversity has mixed effects, and they caution against the ubiquitous use of space-filling initializations in BO. To the extent that humans use explore-exploit search strategies similar to BO, our results provide a testable conjecture for why and when diversity may affect human-subject or design team experiments.
Full Text Available
Freeform Templates: Combining Freeform Curation with Structured Templates

https://doi.org/10.1145/3591196.3593337

MacNeil, Stephen; Huang, Ziheng; Chen, Kenneth; Ding, Zijian; Yu, Alexander; Nakai, Kendall; Dow, Steven P (June 2023, ACM)

Full Text Available
An Intermediate Language for General Sparse Format Customization

https://doi.org/10.1109/LCA.2023.3262610

Liu, Jie; Zhao, Zhongyuan; Ding, Zijian; Brock, Benjamin; Rong, Hongbo; Zhang, Zhiru (January 2023, IEEE Computer Architecture Letters)

Full Text Available
Empirical Study on the Acceleration/Deceleration Constraints Under Commercial Adaptive Cruise Control

https://doi.org/10.1109/ITSC55140.2022.9921922

Zhou, Hao; Zhou, Anye; Ding, Zijian; Laval, Jorge; Peeta, Srinivas (October 2022, IEEE ITSC)
